Dataset statistics
| | Dataset A | Dataset B |
|---|---|---|
| Number of variables | 12 | 12 |
| Number of observations | 446 | 446 |
| Missing cells | 443 | 442 |
| Missing cells (%) | 8.3% | 8.3% |
| Duplicate rows | 0 | 0 |
| Duplicate rows (%) | 0.0% | 0.0% |
| Total size in memory | 45.3 KiB | 45.3 KiB |
| Average record size in memory | 104.0 B | 104.0 B |
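The missing-cell totals above can be cross-checked against the per-variable missing counts reported in the Age, Cabin, and Embarked sections of this report. The following is a minimal sketch in plain Python; it assumes the percentage is computed as missing cells divided by variables times observations, and the helper name `missing_summary` is illustrative.

```python
# Cross-check of the "Missing cells" rows, using only numbers from this report.
# Assumption: "Missing cells (%)" = missing cells / (variables * observations).
N_VARIABLES = 12
N_OBSERVATIONS = 446

# Per-variable missing counts as reported in the Age, Cabin and Embarked sections.
missing_by_variable = {
    "Dataset A": {"Age": 92, "Cabin": 350, "Embarked": 1},
    "Dataset B": {"Age": 91, "Cabin": 351, "Embarked": 0},
}


def missing_summary(per_variable):
    """Return (total missing cells, share of all cells) for one dataset."""
    total = sum(per_variable.values())
    share = total / (N_VARIABLES * N_OBSERVATIONS)
    return total, share


for name, counts in missing_by_variable.items():
    total, share = missing_summary(counts)
    print(f"{name}: {total} missing cells ({share:.1%})")
```

Under that assumption the totals come out to 443 and 442 cells, which agrees with the table.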
Variable types
| | Dataset A | Dataset B |
|---|---|---|
| Numeric | 5 | 5 |
| Categorical | 4 | 4 |
| Text | 3 | 3 |
Alerts
| Dataset A | Dataset B | Alert type |
|---|---|---|
| Sex is highly overall correlated with Survived | Sex is highly overall correlated with Survived | High Correlation |
| Survived is highly overall correlated with Sex | Survived is highly overall correlated with Sex | High Correlation |
| Age has 92 (20.6%) missing values | Age has 91 (20.4%) missing values | Missing |
| Cabin has 350 (78.5%) missing values | Cabin has 351 (78.7%) missing values | Missing |
| PassengerId has unique values | PassengerId has unique values | Unique |
| Name has unique values | Name has unique values | Unique |
| SibSp has 319 (71.5%) zeros | SibSp has 302 (67.7%) zeros | Zeros |
| Parch has 361 (80.9%) zeros | Parch has 337 (75.6%) zeros | Zeros |
| Fare has 11 (2.5%) zeros | Fare has 7 (1.6%) zeros | Zeros |
Reproduction
| | Dataset A | Dataset B |
|---|---|---|
| Analysis started | 2024-07-15 18:27:12.187124 | 2024-07-15 18:27:15.548406 |
| Analysis finished | 2024-07-15 18:27:15.547185 | 2024-07-15 18:27:18.622790 |
| Duration | 3.36 seconds | 3.07 seconds |
| Software version | ydata-profiling v0.0.dev0 | ydata-profiling v0.0.dev0 |
| Download configuration | config.json | config.json |
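For orientation, a two-column report of this shape can be produced with the comparison feature of ydata-profiling. The sketch below is an assumption about how that might look, not the configuration actually used for this run. The function name `build_comparison_report`, the report titles, and the output path are illustrative.

```python
def build_comparison_report(df_a, df_b, output_path="comparison.html"):
    """Profile two pandas DataFrames and write a side-by-side comparison report."""
    # Imported lazily so this module can be loaded without ydata-profiling installed.
    from ydata_profiling import ProfileReport

    # Assumption: the report titles are what label the two columns.
    report_a = ProfileReport(df_a, title="Dataset A")
    report_b = ProfileReport(df_b, title="Dataset B")

    # compare() returns a new report that presents both profiles side by side.
    comparison = report_a.compare(report_b)
    comparison.to_file(output_path)
    return comparison


# Hypothetical usage: the CSV path and the sampling scheme are assumptions,
# not details taken from this report.
# import pandas as pd
# df = pd.read_csv("passengers.csv")
# build_comparison_report(df.sample(446, random_state=0), df.sample(446, random_state=1))
```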
PassengerId
Real number (ℝ)
| | Dataset A | Dataset B |
|---|---|---|
| Distinct | 446 | 446 |
| Distinct (%) | 100.0% | 100.0% |
| Missing | 0 | 0 |
| Missing (%) | 0.0% | 0.0% |
| Infinite | 0 | 0 |
| Infinite (%) | 0.0% | 0.0% |
| Mean | 455.32287 | 430.2287 |
| | Dataset A | Dataset B |
|---|---|---|
| Minimum | 1 | 1 |
| Maximum | 891 | 891 |
| Zeros | 0 | 0 |
| Zeros (%) | 0.0% | 0.0% |
| Negative | 0 | 0 |
| Negative (%) | 0.0% | 0.0% |
| Memory size | 7.0 KiB | 7.0 KiB |
Quantile statistics
| | Dataset A | Dataset B |
|---|---|---|
| Minimum | 1 | 1 |
| 5-th percentile | 44.25 | 43.25 |
| Q1 | 219.25 | 204.25 |
| median | 465.5 | 410 |
| Q3 | 686.5 | 654.75 |
| 95-th percentile | 841.5 | 850.25 |
| Maximum | 891 | 891 |
| Range | 890 | 890 |
| Interquartile range (IQR) | 467.25 | 450.5 |
Descriptive statistics
| | Dataset A | Dataset B |
|---|---|---|
| Standard deviation | 259.72165 | 261.00336 |
| Coefficient of variation (CV) | 0.57041204 | 0.60666189 |
| Kurtosis | -1.2467364 | -1.2234281 |
| Mean | 455.32287 | 430.2287 |
| Median Absolute Deviation (MAD) | 236 | 225 |
| Skewness | -0.05590083 | 0.089698726 |
| Sum | 203074 | 191882 |
| Variance | 67455.334 | 68122.752 |
| Monotonicity | Not monotonic | Not monotonic |
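Several of the PassengerId figures above are derived from one another, which makes them easy to sanity-check. The sketch below uses only values copied from the tables in this section and tests the usual definitions (mean = sum / n, variance = std², CV = std / mean, IQR = Q3 − Q1). The tolerance and the helper name `check` are assumptions made for illustration.

```python
import math

# Values copied from the PassengerId tables in this report.
stats = {
    "Dataset A": {
        "n": 446, "sum": 203074, "mean": 455.32287, "std": 259.72165,
        "variance": 67455.334, "cv": 0.57041204,
        "q1": 219.25, "q3": 686.5, "iqr": 467.25,
    },
    "Dataset B": {
        "n": 446, "sum": 191882, "mean": 430.2287, "std": 261.00336,
        "variance": 68122.752, "cv": 0.60666189,
        "q1": 204.25, "q3": 654.75, "iqr": 450.5,
    },
}


def check(s, rel_tol=1e-5):
    """Compare each reported statistic with the value implied by the others."""
    return {
        "mean = sum / n": math.isclose(s["sum"] / s["n"], s["mean"], rel_tol=rel_tol),
        "variance = std ** 2": math.isclose(s["std"] ** 2, s["variance"], rel_tol=rel_tol),
        "cv = std / mean": math.isclose(s["std"] / s["mean"], s["cv"], rel_tol=rel_tol),
        "iqr = q3 - q1": math.isclose(s["q3"] - s["q1"], s["iqr"], rel_tol=rel_tol),
    }


for name, s in stats.items():
    for relation, ok in check(s).items():
        print(f"{name}: {relation}: {'consistent' if ok else 'inconsistent'}")
```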
| Value | Count | Frequency (%) |
| 373 | 1 | 0.2% |
| 14 | 1 | 0.2% |
| 299 | 1 | 0.2% |
| 685 | 1 | 0.2% |
| 532 | 1 | 0.2% |
| 842 | 1 | 0.2% |
| 97 | 1 | 0.2% |
| 321 | 1 | 0.2% |
| 570 | 1 | 0.2% |
| 2 | 1 | 0.2% |
| Other values (436) | 436 |
| Value | Count | Frequency (%) |
| 13 | 1 | 0.2% |
| 762 | 1 | 0.2% |
| 565 | 1 | 0.2% |
| 756 | 1 | 0.2% |
| 142 | 1 | 0.2% |
| 475 | 1 | 0.2% |
| 813 | 1 | 0.2% |
| 684 | 1 | 0.2% |
| 570 | 1 | 0.2% |
| 2 | 1 | 0.2% |
| Other values (436) | 436 |
| Value | Count | Frequency (%) |
| 1 | 1 | |
| 2 | 1 | |
| 4 | 1 | |
| 5 | 1 | |
| 7 | 1 | |
| 10 | 1 | |
| 12 | 1 | |
| 14 | 1 | |
| 15 | 1 | |
| 20 | 1 |
| Value | Count | Frequency (%) |
| 1 | 1 | |
| 2 | 1 | |
| 4 | 1 | |
| 6 | 1 | |
| 7 | 1 | |
| 8 | 1 | |
| 11 | 1 | |
| 13 | 1 | |
| 14 | 1 | |
| 23 | 1 |
Survived
Categorical
| | Dataset A | Dataset B |
|---|---|---|
| Distinct | 2 | 2 |
| Distinct (%) | 0.4% | 0.4% |
| Missing | 0 | 0 |
| Missing (%) | 0.0% | 0.0% |
| Memory size | 7.0 KiB | 7.0 KiB |
Length
| | Dataset A | Dataset B |
|---|---|---|
| Max length | 1 | 1 |
| Median length | 1 | 1 |
| Mean length | 1 | 1 |
| Min length | 1 | 1 |
Characters and Unicode
| | Dataset A | Dataset B |
|---|---|---|
| Total characters | 446 | 446 |
| Distinct characters | 2 | 2 |
| Distinct categories | 1 | 1 |
| Distinct scripts | 1 | 1 |
| Distinct blocks | 1 | 1 |
Unique
| | Dataset A | Dataset B |
|---|---|---|
| Unique | 0 | 0 |
| Unique (%) | 0.0% | 0.0% |
Sample
| | Dataset A | Dataset B |
|---|---|---|
| 1st row | 0 | 0 |
| 2nd row | 0 | 1 |
| 3rd row | 1 | 1 |
| 4th row | 1 | 0 |
| 5th row | 0 | 0 |
Common Values
Dataset A
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 280 | 62.8% |
| 1 | 166 | 37.2% |
Dataset B
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 271 | 60.8% |
| 1 | 175 | 39.2% |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 280 | |
| 1 | 166 |
| Value | Count | Frequency (%) |
| 0 | 271 | |
| 1 | 175 |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 446 |
| Value | Count | Frequency (%) |
| (unknown) | 446 |
Most frequent character per category
(unknown)
| Value | Count | Frequency (%) |
| 0 | 280 | |
| 1 | 166 |
| Value | Count | Frequency (%) |
| 0 | 271 | |
| 1 | 175 |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 446 |
| Value | Count | Frequency (%) |
| (unknown) | 446 |
Most frequent character per script
(unknown)
| Value | Count | Frequency (%) |
| 0 | 280 | |
| 1 | 166 |
| Value | Count | Frequency (%) |
| 0 | 271 | |
| 1 | 175 |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 446 |
| Value | Count | Frequency (%) |
| (unknown) | 446 |
Most frequent character per block
(unknown)
| Value | Count | Frequency (%) |
| 0 | 280 | |
| 1 | 166 |
| Value | Count | Frequency (%) |
| 0 | 271 | |
| 1 | 175 |
Pclass
Categorical
| | Dataset A | Dataset B |
|---|---|---|
| Distinct | 3 | 3 |
| Distinct (%) | 0.7% | 0.7% |
| Missing | 0 | 0 |
| Missing (%) | 0.0% | 0.0% |
| Memory size | 7.0 KiB | 7.0 KiB |
Length
| | Dataset A | Dataset B |
|---|---|---|
| Max length | 1 | 1 |
| Median length | 1 | 1 |
| Mean length | 1 | 1 |
| Min length | 1 | 1 |
Characters and Unicode
| | Dataset A | Dataset B |
|---|---|---|
| Total characters | 446 | 446 |
| Distinct characters | 3 | 3 |
| Distinct categories | 1 | 1 |
| Distinct scripts | 1 | 1 |
| Distinct blocks | 1 | 1 |
Unique
| | Dataset A | Dataset B |
|---|---|---|
| Unique | 0 | 0 |
| Unique (%) | 0.0% | 0.0% |
Sample
| | Dataset A | Dataset B |
|---|---|---|
| 1st row | 3 | 3 |
| 2nd row | 1 | 2 |
| 3rd row | 1 | 2 |
| 4th row | 1 | 3 |
| 5th row | 1 | 3 |
Common Values
Dataset A
| Value | Count | Frequency (%) |
|---|---|---|
| 3 | 253 | 56.7% |
| 1 | 109 | 24.4% |
| 2 | 84 | 18.8% |
Dataset B
| Value | Count | Frequency (%) |
|---|---|---|
| 3 | 257 | 57.6% |
| 1 | 99 | 22.2% |
| 2 | 90 | 20.2% |
Most occurring characters
| Value | Count | Frequency (%) |
| 3 | 253 | |
| 1 | 109 | |
| 2 | 84 | 18.8% |
| Value | Count | Frequency (%) |
| 3 | 257 | |
| 1 | 99 | 22.2% |
| 2 | 90 | 20.2% |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 446 |
| Value | Count | Frequency (%) |
| (unknown) | 446 |
Most frequent character per category
(unknown)
| Value | Count | Frequency (%) |
| 3 | 253 | |
| 1 | 109 | |
| 2 | 84 | 18.8% |
| Value | Count | Frequency (%) |
| 3 | 257 | |
| 1 | 99 | 22.2% |
| 2 | 90 | 20.2% |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 446 |
| Value | Count | Frequency (%) |
| (unknown) | 446 |
Most frequent character per script
(unknown)
| Value | Count | Frequency (%) |
| 3 | 253 | |
| 1 | 109 | |
| 2 | 84 | 18.8% |
| Value | Count | Frequency (%) |
| 3 | 257 | |
| 1 | 99 | 22.2% |
| 2 | 90 | 20.2% |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 446 |
| Value | Count | Frequency (%) |
| (unknown) | 446 |
Most frequent character per block
(unknown)
| Value | Count | Frequency (%) |
| 3 | 253 | |
| 1 | 109 | |
| 2 | 84 | 18.8% |
| Value | Count | Frequency (%) |
| 3 | 257 | |
| 1 | 99 | 22.2% |
| 2 | 90 | 20.2% |
Name
Text
| | Dataset A | Dataset B |
|---|---|---|
| Distinct | 446 | 446 |
| Distinct (%) | 100.0% | 100.0% |
| Missing | 0 | 0 |
| Missing (%) | 0.0% | 0.0% |
| Memory size | 7.0 KiB | 7.0 KiB |
Length
| | Dataset A | Dataset B |
|---|---|---|
| Max length | 67 | 65 |
| Median length | 45 | 47 |
| Mean length | 26.59417 | 27.338565 |
| Min length | 12 | 12 |
Characters and Unicode
| | Dataset A | Dataset B |
|---|---|---|
| Total characters | 11861 | 12193 |
| Distinct characters | 60 | 59 |
| Distinct categories | 1 | 1 |
| Distinct scripts | 1 | 1 |
| Distinct blocks | 1 | 1 |
Unique
| | Dataset A | Dataset B |
|---|---|---|
| Unique | 446 | 446 |
| Unique (%) | 100.0% | 100.0% |
Sample
| | Dataset A | Dataset B |
|---|---|---|
| 1st row | Beavan, Mr. William Thomas | Saundercock, Mr. William Henry |
| 2nd row | Meyer, Mr. Edgar Joseph | Kantor, Mrs. Sinai (Miriam Sternin) |
| 3rd row | Hawksford, Mr. Walter James | Mellinger, Mrs. (Elizabeth Anne Maidment) |
| 4th row | Sloper, Mr. William Thompson | Kallio, Mr. Nikolai Erland |
| 5th row | Butt, Major. Archibald Willingham | Kelly, Mr. James |
| Value | Count | Frequency (%) |
| mr | 269 | 15.0% |
| miss | 97 | 5.4% |
| mrs | 57 | 3.2% |
| william | 27 | 1.5% |
| john | 19 | 1.1% |
| henry | 15 | 0.8% |
| master | 15 | 0.8% |
| george | 13 | 0.7% |
| james | 12 | 0.7% |
| thomas | 11 | 0.6% |
| Other values (909) | 1257 |
| Value | Count | Frequency (%) |
| mr | 260 | 14.3% |
| miss | 98 | 5.4% |
| mrs | 61 | 3.3% |
| william | 32 | 1.8% |
| john | 26 | 1.4% |
| master | 20 | 1.1% |
| henry | 19 | 1.0% |
| charles | 12 | 0.7% |
| george | 10 | 0.5% |
| joseph | 10 | 0.5% |
| Other values (903) | 1273 |
Most occurring characters
| Value | Count | Frequency (%) |
| (space) | 1346 | 11.3% |
| r | 955 | 8.1% |
| e | 838 | 7.1% |
| a | 817 | 6.9% |
| i | 669 | 5.6% |
| s | 659 | 5.6% |
| n | 651 | 5.5% |
| M | 568 | 4.8% |
| l | 521 | 4.4% |
| o | 490 | 4.1% |
| Other values (50) | 4347 |
| Value | Count | Frequency (%) |
| (space) | 1375 | 11.3% |
| r | 998 | 8.2% |
| e | 860 | 7.1% |
| a | 850 | 7.0% |
| i | 683 | 5.6% |
| n | 665 | 5.5% |
| s | 656 | 5.4% |
| M | 571 | 4.7% |
| l | 539 | 4.4% |
| o | 517 | 4.2% |
| Other values (49) | 4479 |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 11861 |
| Value | Count | Frequency (%) |
| (unknown) | 12193 |
Most frequent character per category
(unknown)
| Value | Count | Frequency (%) |
| (space) | 1346 | 11.3% |
| r | 955 | 8.1% |
| e | 838 | 7.1% |
| a | 817 | 6.9% |
| i | 669 | 5.6% |
| s | 659 | 5.6% |
| n | 651 | 5.5% |
| M | 568 | 4.8% |
| l | 521 | 4.4% |
| o | 490 | 4.1% |
| Other values (50) | 4347 |
| Value | Count | Frequency (%) |
| (space) | 1375 | 11.3% |
| r | 998 | 8.2% |
| e | 860 | 7.1% |
| a | 850 | 7.0% |
| i | 683 | 5.6% |
| n | 665 | 5.5% |
| s | 656 | 5.4% |
| M | 571 | 4.7% |
| l | 539 | 4.4% |
| o | 517 | 4.2% |
| Other values (49) | 4479 |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 11861 |
| Value | Count | Frequency (%) |
| (unknown) | 12193 |
Most frequent character per script
(unknown)
| Value | Count | Frequency (%) |
| (space) | 1346 | 11.3% |
| r | 955 | 8.1% |
| e | 838 | 7.1% |
| a | 817 | 6.9% |
| i | 669 | 5.6% |
| s | 659 | 5.6% |
| n | 651 | 5.5% |
| M | 568 | 4.8% |
| l | 521 | 4.4% |
| o | 490 | 4.1% |
| Other values (50) | 4347 |
| Value | Count | Frequency (%) |
| (space) | 1375 | 11.3% |
| r | 998 | 8.2% |
| e | 860 | 7.1% |
| a | 850 | 7.0% |
| i | 683 | 5.6% |
| n | 665 | 5.5% |
| s | 656 | 5.4% |
| M | 571 | 4.7% |
| l | 539 | 4.4% |
| o | 517 | 4.2% |
| Other values (49) | 4479 |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 11861 |
| Value | Count | Frequency (%) |
| (unknown) | 12193 |
Most frequent character per block
(unknown)
| Value | Count | Frequency (%) |
| (space) | 1346 | 11.3% |
| r | 955 | 8.1% |
| e | 838 | 7.1% |
| a | 817 | 6.9% |
| i | 669 | 5.6% |
| s | 659 | 5.6% |
| n | 651 | 5.5% |
| M | 568 | 4.8% |
| l | 521 | 4.4% |
| o | 490 | 4.1% |
| Other values (50) | 4347 |
| Value | Count | Frequency (%) |
| (space) | 1375 | 11.3% |
| r | 998 | 8.2% |
| e | 860 | 7.1% |
| a | 850 | 7.0% |
| i | 683 | 5.6% |
| n | 665 | 5.5% |
| s | 656 | 5.4% |
| M | 571 | 4.7% |
| l | 539 | 4.4% |
| o | 517 | 4.2% |
| Other values (49) | 4479 |
Sex
Categorical
| | Dataset A | Dataset B |
|---|---|---|
| Distinct | 2 | 2 |
| Distinct (%) | 0.4% | 0.4% |
| Missing | 0 | 0 |
| Missing (%) | 0.0% | 0.0% |
| Memory size | 7.0 KiB | 7.0 KiB |
Length
| | Dataset A | Dataset B |
|---|---|---|
| Max length | 6 | 6 |
| Median length | 4 | 4 |
| Mean length | 4.6950673 | 4.7264574 |
| Min length | 4 | 4 |
Characters and Unicode
| | Dataset A | Dataset B |
|---|---|---|
| Total characters | 2094 | 2108 |
| Distinct characters | 5 | 5 |
| Distinct categories | 1 | 1 |
| Distinct scripts | 1 | 1 |
| Distinct blocks | 1 | 1 |
Unique
| | Dataset A | Dataset B |
|---|---|---|
| Unique | 0 | 0 |
| Unique (%) | 0.0% | 0.0% |
Sample
| | Dataset A | Dataset B |
|---|---|---|
| 1st row | male | male |
| 2nd row | male | female |
| 3rd row | male | female |
| 4th row | male | male |
| 5th row | male | male |
Common Values
Dataset A
| Value | Count | Frequency (%) |
|---|---|---|
| male | 291 | 65.2% |
| female | 155 | 34.8% |
Dataset B
| Value | Count | Frequency (%) |
|---|---|---|
| male | 284 | 63.7% |
| female | 162 | 36.3% |
Most occurring characters
| Value | Count | Frequency (%) |
| e | 601 | |
| m | 446 | |
| a | 446 | |
| l | 446 | |
| f | 155 | 7.4% |
| Value | Count | Frequency (%) |
| e | 608 | |
| m | 446 | |
| a | 446 | |
| l | 446 | |
| f | 162 | 7.7% |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 2094 |
| Value | Count | Frequency (%) |
| (unknown) | 2108 |
Most frequent character per category
(unknown)
| Value | Count | Frequency (%) |
| e | 601 | |
| m | 446 | |
| a | 446 | |
| l | 446 | |
| f | 155 | 7.4% |
| Value | Count | Frequency (%) |
| e | 608 | |
| m | 446 | |
| a | 446 | |
| l | 446 | |
| f | 162 | 7.7% |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 2094 |
| Value | Count | Frequency (%) |
| (unknown) | 2108 |
Most frequent character per script
(unknown)
| Value | Count | Frequency (%) |
| e | 601 | |
| m | 446 | |
| a | 446 | |
| l | 446 | |
| f | 155 | 7.4% |
| Value | Count | Frequency (%) |
| e | 608 | |
| m | 446 | |
| a | 446 | |
| l | 446 | |
| f | 162 | 7.7% |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 2094 |
| Value | Count | Frequency (%) |
| (unknown) | 2108 |
Most frequent character per block
(unknown)
| Value | Count | Frequency (%) |
| e | 601 | |
| m | 446 | |
| a | 446 | |
| l | 446 | |
| f | 155 | 7.4% |
| Value | Count | Frequency (%) |
| e | 608 | |
| m | 446 | |
| a | 446 | |
| l | 446 | |
| f | 162 | 7.7% |
Age
Real number (ℝ)
| | Dataset A | Dataset B |
|---|---|---|
| Distinct | 77 | 74 |
| Distinct (%) | 21.8% | 20.8% |
| Missing | 92 | 91 |
| Missing (%) | 20.6% | 20.4% |
| Infinite | 0 | 0 |
| Infinite (%) | 0.0% | 0.0% |
| Mean | 29.95435 | 28.65707 |
| | Dataset A | Dataset B |
|---|---|---|
| Minimum | 0.42 | 0.42 |
| Maximum | 71 | 74 |
| Zeros | 0 | 0 |
| Zeros (%) | 0.0% | 0.0% |
| Negative | 0 | 0 |
| Negative (%) | 0.0% | 0.0% |
| Memory size | 7.0 KiB | 7.0 KiB |
Quantile statistics
| | Dataset A | Dataset B |
|---|---|---|
| Minimum | 0.42 | 0.42 |
| 5-th percentile | 4 | 4 |
| Q1 | 21 | 19 |
| median | 29 | 28 |
| Q3 | 39 | 36 |
| 95-th percentile | 55.675 | 52.6 |
| Maximum | 71 | 74 |
| Range | 70.58 | 73.58 |
| Interquartile range (IQR) | 18 | 17 |
Descriptive statistics
| | Dataset A | Dataset B |
|---|---|---|
| Standard deviation | 14.495276 | 14.027763 |
| Coefficient of variation (CV) | 0.48391221 | 0.48950442 |
| Kurtosis | 0.0066627419 | 0.48710028 |
| Mean | 29.95435 | 28.65707 |
| Median Absolute Deviation (MAD) | 9 | 8 |
| Skewness | 0.32661672 | 0.44577308 |
| Sum | 10603.84 | 10173.26 |
| Variance | 210.11303 | 196.77812 |
| Monotonicity | Not monotonic | Not monotonic |
| Value | Count | Frequency (%) |
| 24 | 16 | 3.6% |
| 19 | 15 | 3.4% |
| 30 | 15 | 3.4% |
| 28 | 13 | 2.9% |
| 22 | 13 | 2.9% |
| 21 | 12 | 2.7% |
| 18 | 12 | 2.7% |
| 25 | 12 | 2.7% |
| 35 | 11 | 2.5% |
| 32 | 11 | 2.5% |
| Other values (67) | 224 | |
| (Missing) | 92 |
| Value | Count | Frequency (%) |
| 24 | 20 | 4.5% |
| 18 | 17 | 3.8% |
| 25 | 16 | 3.6% |
| 28 | 14 | 3.1% |
| 30 | 13 | 2.9% |
| 36 | 12 | 2.7% |
| 21 | 12 | 2.7% |
| 16 | 11 | 2.5% |
| 19 | 11 | 2.5% |
| 35 | 10 | 2.2% |
| Other values (64) | 219 | |
| (Missing) | 91 |
| Value | Count | Frequency (%) |
| 0.42 | 1 | 0.2% |
| 0.67 | 1 | 0.2% |
| 0.75 | 2 | |
| 0.83 | 1 | 0.2% |
| 0.92 | 1 | 0.2% |
| 1 | 3 | |
| 2 | 4 | |
| 3 | 3 | |
| 4 | 3 | |
| 6 | 3 |
| Value | Count | Frequency (%) |
| 0.42 | 1 | 0.2% |
| 0.67 | 1 | 0.2% |
| 0.75 | 1 | 0.2% |
| 0.92 | 1 | 0.2% |
| 1 | 4 | |
| 2 | 5 | |
| 3 | 3 | |
| 4 | 4 | |
| 5 | 3 | |
| 7 | 1 | 0.2% |
SibSp
Real number (ℝ)
| | Dataset A | Dataset B |
|---|---|---|
| Distinct | 7 | 7 |
| Distinct (%) | 1.6% | 1.6% |
| Missing | 0 | 0 |
| Missing (%) | 0.0% | 0.0% |
| Infinite | 0 | 0 |
| Infinite (%) | 0.0% | 0.0% |
| Mean | 0.43273543 | 0.52690583 |
| | Dataset A | Dataset B |
|---|---|---|
| Minimum | 0 | 0 |
| Maximum | 8 | 8 |
| Zeros | 319 | 302 |
| Zeros (%) | 71.5% | 67.7% |
| Negative | 0 | 0 |
| Negative (%) | 0.0% | 0.0% |
| Memory size | 7.0 KiB | 7.0 KiB |
Quantile statistics
| | Dataset A | Dataset B |
|---|---|---|
| Minimum | 0 | 0 |
| 5-th percentile | 0 | 0 |
| Q1 | 0 | 0 |
| median | 0 | 0 |
| Q3 | 1 | 1 |
| 95-th percentile | 2 | 2.75 |
| Maximum | 8 | 8 |
| Range | 8 | 8 |
| Interquartile range (IQR) | 1 | 1 |
Descriptive statistics
| | Dataset A | Dataset B |
|---|---|---|
| Standard deviation | 0.9087367 | 1.0904871 |
| Coefficient of variation (CV) | 2.0999822 | 2.0696053 |
| Kurtosis | 16.242674 | 16.902633 |
| Mean | 0.43273543 | 0.52690583 |
| Median Absolute Deviation (MAD) | 0 | 0 |
| Skewness | 3.404427 | 3.5811554 |
| Sum | 193 | 235 |
| Variance | 0.82580239 | 1.1891621 |
| Monotonicity | Not monotonic | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 319 | |
| 1 | 96 | 21.5% |
| 2 | 12 | 2.7% |
| 3 | 9 | 2.0% |
| 4 | 7 | 1.6% |
| 5 | 2 | 0.4% |
| 8 | 1 | 0.2% |
| Value | Count | Frequency (%) |
| 0 | 302 | |
| 1 | 107 | 24.0% |
| 2 | 14 | 3.1% |
| 4 | 10 | 2.2% |
| 3 | 7 | 1.6% |
| 5 | 3 | 0.7% |
| 8 | 3 | 0.7% |
| Value | Count | Frequency (%) |
| 0 | 319 | |
| 1 | 96 | 21.5% |
| 2 | 12 | 2.7% |
| 3 | 9 | 2.0% |
| 4 | 7 | 1.6% |
| 5 | 2 | 0.4% |
| 8 | 1 | 0.2% |
| Value | Count | Frequency (%) |
| 0 | 302 | |
| 1 | 107 | 24.0% |
| 2 | 14 | 3.1% |
| 3 | 7 | 1.6% |
| 4 | 10 | 2.2% |
| 5 | 3 | 0.7% |
| 8 | 3 | 0.7% |
Parch
Real number (ℝ)
| | Dataset A | Dataset B |
|---|---|---|
| Distinct | 7 | 7 |
| Distinct (%) | 1.6% | 1.6% |
| Missing | 0 | 0 |
| Missing (%) | 0.0% | 0.0% |
| Infinite | 0 | 0 |
| Infinite (%) | 0.0% | 0.0% |
| Mean | 0.31838565 | 0.38565022 |
| | Dataset A | Dataset B |
|---|---|---|
| Minimum | 0 | 0 |
| Maximum | 6 | 6 |
| Zeros | 361 | 337 |
| Zeros (%) | 80.9% | 75.6% |
| Negative | 0 | 0 |
| Negative (%) | 0.0% | 0.0% |
| Memory size | 7.0 KiB | 7.0 KiB |
Quantile statistics
| | Dataset A | Dataset B |
|---|---|---|
| Minimum | 0 | 0 |
| 5-th percentile | 0 | 0 |
| Q1 | 0 | 0 |
| median | 0 | 0 |
| Q3 | 0 | 0 |
| 95-th percentile | 2 | 2 |
| Maximum | 6 | 6 |
| Range | 6 | 6 |
| Interquartile range (IQR) | 0 | 0 |
Descriptive statistics
| | Dataset A | Dataset B |
|---|---|---|
| Standard deviation | 0.7800738 | 0.81512136 |
| Coefficient of variation (CV) | 2.450091 | 2.1136286 |
| Kurtosis | 13.949035 | 11.765569 |
| Mean | 0.31838565 | 0.38565022 |
| Median Absolute Deviation (MAD) | 0 | 0 |
| Skewness | 3.2857709 | 2.9356946 |
| Sum | 142 | 172 |
| Variance | 0.60851514 | 0.66442283 |
| Monotonicity | Not monotonic | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 361 | |
| 1 | 44 | 9.9% |
| 2 | 34 | 7.6% |
| 5 | 2 | 0.4% |
| 4 | 2 | 0.4% |
| 3 | 2 | 0.4% |
| 6 | 1 | 0.2% |
| Value | Count | Frequency (%) |
| 0 | 337 | |
| 1 | 62 | 13.9% |
| 2 | 41 | 9.2% |
| 5 | 3 | 0.7% |
| 3 | 1 | 0.2% |
| 4 | 1 | 0.2% |
| 6 | 1 | 0.2% |
| Value | Count | Frequency (%) |
| 0 | 361 | |
| 1 | 44 | 9.9% |
| 2 | 34 | 7.6% |
| 3 | 2 | 0.4% |
| 4 | 2 | 0.4% |
| 5 | 2 | 0.4% |
| 6 | 1 | 0.2% |
| Value | Count | Frequency (%) |
| 0 | 337 | |
| 1 | 62 | 13.9% |
| 2 | 41 | 9.2% |
| 3 | 1 | 0.2% |
| 4 | 1 | 0.2% |
| 5 | 3 | 0.7% |
| 6 | 1 | 0.2% |
Ticket
Text
| | Dataset A | Dataset B |
|---|---|---|
| Distinct | 391 | 384 |
| Distinct (%) | 87.7% | 86.1% |
| Missing | 0 | 0 |
| Missing (%) | 0.0% | 0.0% |
| Memory size | 7.0 KiB | 7.0 KiB |
Length
| | Dataset A | Dataset B |
|---|---|---|
| Max length | 18 | 18 |
| Median length | 17 | 17 |
| Mean length | 6.8004484 | 6.8520179 |
| Min length | 3 | 3 |
Characters and Unicode
| | Dataset A | Dataset B |
|---|---|---|
| Total characters | 3033 | 3056 |
| Distinct characters | 35 | 31 |
| Distinct categories | 1 | 1 |
| Distinct scripts | 1 | 1 |
| Distinct blocks | 1 | 1 |
Unique
| | Dataset A | Dataset B |
|---|---|---|
| Unique | 348 | 333 |
| Unique (%) | 78.0% | 74.7% |
Sample
| | Dataset A | Dataset B |
|---|---|---|
| 1st row | 323951 | A/5. 2151 |
| 2nd row | PC 17604 | 244367 |
| 3rd row | 16988 | 250644 |
| 4th row | 113788 | STON/O 2. 3101274 |
| 5th row | 113050 | 363592 |
| Value | Count | Frequency (%) |
| pc | 30 | 5.3% |
| a/5 | 12 | 2.1% |
| c.a | 11 | 1.9% |
| ston/o | 7 | 1.2% |
| 2 | 7 | 1.2% |
| soton/o.q | 6 | 1.1% |
| soton/oq | 5 | 0.9% |
| 1601 | 5 | 0.9% |
| sc/paris | 5 | 0.9% |
| 347088 | 5 | 0.9% |
| Other values (409) | 472 |
| Value | Count | Frequency (%) |
| pc | 31 | 5.4% |
| c.a | 15 | 2.6% |
| ca | 8 | 1.4% |
| ston/o | 8 | 1.4% |
| 2 | 8 | 1.4% |
| sc/paris | 7 | 1.2% |
| a/5 | 5 | 0.9% |
| 3101295 | 5 | 0.9% |
| 2144 | 4 | 0.7% |
| a/4 | 4 | 0.7% |
| Other values (406) | 480 |
Most occurring characters
| Value | Count | Frequency (%) |
| 3 | 363 | |
| 1 | 346 | |
| 2 | 285 | |
| 7 | 259 | |
| 4 | 236 | 7.8% |
| 0 | 209 | 6.9% |
| 6 | 200 | 6.6% |
| 5 | 199 | 6.6% |
| 9 | 166 | 5.5% |
| 8 | 149 | 4.9% |
| Other values (25) | 621 |
| Value | Count | Frequency (%) |
| 3 | 379 | |
| 1 | 342 | |
| 2 | 309 | |
| 7 | 237 | 7.8% |
| 4 | 236 | 7.7% |
| 6 | 203 | 6.6% |
| 0 | 200 | 6.5% |
| 5 | 186 | 6.1% |
| 9 | 169 | 5.5% |
| 8 | 150 | 4.9% |
| Other values (21) | 645 |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 3033 |
| Value | Count | Frequency (%) |
| (unknown) | 3056 |
Most frequent character per category
(unknown)
| Value | Count | Frequency (%) |
| 3 | 363 | |
| 1 | 346 | |
| 2 | 285 | |
| 7 | 259 | |
| 4 | 236 | 7.8% |
| 0 | 209 | 6.9% |
| 6 | 200 | 6.6% |
| 5 | 199 | 6.6% |
| 9 | 166 | 5.5% |
| 8 | 149 | 4.9% |
| Other values (25) | 621 |
| Value | Count | Frequency (%) |
| 3 | 379 | |
| 1 | 342 | |
| 2 | 309 | |
| 7 | 237 | 7.8% |
| 4 | 236 | 7.7% |
| 6 | 203 | 6.6% |
| 0 | 200 | 6.5% |
| 5 | 186 | 6.1% |
| 9 | 169 | 5.5% |
| 8 | 150 | 4.9% |
| Other values (21) | 645 |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 3033 |
| Value | Count | Frequency (%) |
| (unknown) | 3056 |
Most frequent character per script
(unknown)
| Value | Count | Frequency (%) |
| 3 | 363 | |
| 1 | 346 | |
| 2 | 285 | |
| 7 | 259 | |
| 4 | 236 | 7.8% |
| 0 | 209 | 6.9% |
| 6 | 200 | 6.6% |
| 5 | 199 | 6.6% |
| 9 | 166 | 5.5% |
| 8 | 149 | 4.9% |
| Other values (25) | 621 |
| Value | Count | Frequency (%) |
| 3 | 379 | |
| 1 | 342 | |
| 2 | 309 | |
| 7 | 237 | 7.8% |
| 4 | 236 | 7.7% |
| 6 | 203 | 6.6% |
| 0 | 200 | 6.5% |
| 5 | 186 | 6.1% |
| 9 | 169 | 5.5% |
| 8 | 150 | 4.9% |
| Other values (21) | 645 |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 3033 |
| Value | Count | Frequency (%) |
| (unknown) | 3056 |
Most frequent character per block
(unknown)
| Value | Count | Frequency (%) |
| 3 | 363 | |
| 1 | 346 | |
| 2 | 285 | |
| 7 | 259 | |
| 4 | 236 | 7.8% |
| 0 | 209 | 6.9% |
| 6 | 200 | 6.6% |
| 5 | 199 | 6.6% |
| 9 | 166 | 5.5% |
| 8 | 149 | 4.9% |
| Other values (25) | 621 |
| Value | Count | Frequency (%) |
| 3 | 379 | |
| 1 | 342 | |
| 2 | 309 | |
| 7 | 237 | 7.8% |
| 4 | 236 | 7.7% |
| 6 | 203 | 6.6% |
| 0 | 200 | 6.5% |
| 5 | 186 | 6.1% |
| 9 | 169 | 5.5% |
| 8 | 150 | 4.9% |
| Other values (21) | 645 |
Fare
Real number (ℝ)
| | Dataset A | Dataset B |
|---|---|---|
| Distinct | 182 | 175 |
| Distinct (%) | 40.8% | 39.2% |
| Missing | 0 | 0 |
| Missing (%) | 0.0% | 0.0% |
| Infinite | 0 | 0 |
| Infinite (%) | 0.0% | 0.0% |
| Mean | 30.477382 | 29.24032 |
| | Dataset A | Dataset B |
|---|---|---|
| Minimum | 0 | 0 |
| Maximum | 512.3292 | 512.3292 |
| Zeros | 11 | 7 |
| Zeros (%) | 2.5% | 1.6% |
| Negative | 0 | 0 |
| Negative (%) | 0.0% | 0.0% |
| Memory size | 7.0 KiB | 7.0 KiB |
Quantile statistics
| | Dataset A | Dataset B |
|---|---|---|
| Minimum | 0 | 0 |
| 5-th percentile | 7.125 | 7.225 |
| Q1 | 7.8958 | 7.8958 |
| median | 13 | 13 |
| Q3 | 30 | 30.5 |
| 95-th percentile | 103.19375 | 86.5 |
| Maximum | 512.3292 | 512.3292 |
| Range | 512.3292 | 512.3292 |
| Interquartile range (IQR) | 22.1042 | 22.6042 |
Descriptive statistics
| | Dataset A | Dataset B |
|---|---|---|
| Standard deviation | 50.774303 | 44.604556 |
| Coefficient of variation (CV) | 1.6659667 | 1.5254469 |
| Kurtosis | 39.898932 | 38.326349 |
| Mean | 30.477382 | 29.24032 |
| Median Absolute Deviation (MAD) | 5.75 | 5.4104 |
| Skewness | 5.3699376 | 5.0534316 |
| Sum | 13592.912 | 13041.183 |
| Variance | 2578.0298 | 1989.5664 |
| Monotonicity | Not monotonic | Not monotonic |
| Value | Count | Frequency (%) |
| 8.05 | 26 | 5.8% |
| 13 | 21 | 4.7% |
| 7.8958 | 17 | 3.8% |
| 10.5 | 13 | 2.9% |
| 7.75 | 13 | 2.9% |
| 7.8542 | 12 | 2.7% |
| 0 | 11 | 2.5% |
| 7.925 | 11 | 2.5% |
| 26.55 | 9 | 2.0% |
| 26 | 9 | 2.0% |
| Other values (172) | 304 |
| Value | Count | Frequency (%) |
| 7.8958 | 24 | 5.4% |
| 7.75 | 23 | 5.2% |
| 8.05 | 22 | 4.9% |
| 13 | 20 | 4.5% |
| 10.5 | 15 | 3.4% |
| 26 | 13 | 2.9% |
| 7.925 | 10 | 2.2% |
| 7.8542 | 9 | 2.0% |
| 7.225 | 8 | 1.8% |
| 0 | 7 | 1.6% |
| Other values (165) | 295 |
| Value | Count | Frequency (%) |
| 0 | 11 | |
| 6.2375 | 1 | 0.2% |
| 6.45 | 1 | 0.2% |
| 6.4958 | 1 | 0.2% |
| 6.75 | 1 | 0.2% |
| 6.8583 | 1 | 0.2% |
| 7.0458 | 1 | 0.2% |
| 7.05 | 5 | |
| 7.125 | 3 | 0.7% |
| 7.1417 | 1 | 0.2% |
| Value | Count | Frequency (%) |
| 0 | 7 | |
| 5 | 1 | 0.2% |
| 6.4375 | 1 | 0.2% |
| 6.4958 | 1 | 0.2% |
| 6.75 | 1 | 0.2% |
| 6.8583 | 1 | 0.2% |
| 6.95 | 1 | 0.2% |
| 6.975 | 1 | 0.2% |
| 7.05 | 3 | |
| 7.0542 | 1 | 0.2% |
Cabin
Text
| | Dataset A | Dataset B |
|---|---|---|
| Distinct | 85 | 77 |
| Distinct (%) | 88.5% | 81.1% |
| Missing | 350 | 351 |
| Missing (%) | 78.5% | 78.7% |
| Memory size | 7.0 KiB | 7.0 KiB |
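The Distinct (%) values for Cabin are consistent with a denominator of non-missing values rather than all 446 observations. The short check below assumes that is how the percentage is derived; the helper name `distinct_share` is illustrative, and the inputs are copied from the table above.

```python
N_OBSERVATIONS = 446

# (distinct, missing, reported "Distinct (%)") for Cabin, copied from this report.
cabin = {
    "Dataset A": (85, 350, "88.5%"),
    "Dataset B": (77, 351, "81.1%"),
}


def distinct_share(distinct, missing, n=N_OBSERVATIONS):
    """Share of distinct values among the non-missing observations."""
    return distinct / (n - missing)


for name, (distinct, missing, reported) in cabin.items():
    computed = f"{distinct_share(distinct, missing):.1%}"
    verdict = "matches" if computed == reported else "differs from"
    print(f"{name}: computed {computed} {verdict} reported {reported}")
```

Dividing by all 446 rows instead would give roughly 19% and 17%, which does not agree with the reported figures.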
Length
| | Dataset A | Dataset B |
|---|---|---|
| Max length | 11 | 15 |
| Median length | 3 | 3 |
| Mean length | 3.7291667 | 3.7894737 |
| Min length | 1 | 1 |
Characters and Unicode
| | Dataset A | Dataset B |
|---|---|---|
| Total characters | 358 | 360 |
| Distinct characters | 18 | 19 |
| Distinct categories | 1 | 1 |
| Distinct scripts | 1 | 1 |
| Distinct blocks | 1 | 1 |
Unique
| | Dataset A | Dataset B |
|---|---|---|
| Unique | 75 | 62 |
| Unique (%) | 78.1% | 65.3% |
Sample
| | Dataset A | Dataset B |
|---|---|---|
| 1st row | D45 | A24 |
| 2nd row | A6 | B94 |
| 3rd row | B38 | A34 |
| 4th row | G6 | C128 |
| 5th row | E8 | A20 |
| Value | Count | Frequency (%) |
| c23 | 3 | 2.6% |
| c27 | 3 | 2.6% |
| f | 3 | 2.6% |
| c25 | 3 | 2.6% |
| b35 | 2 | 1.8% |
| e101 | 2 | 1.8% |
| g6 | 2 | 1.8% |
| d35 | 2 | 1.8% |
| e44 | 2 | 1.8% |
| c92 | 2 | 1.8% |
| Other values (86) | 90 |
| Value | Count | Frequency (%) |
| g6 | 4 | 3.4% |
| c26 | 3 | 2.6% |
| c22 | 3 | 2.6% |
| b55 | 2 | 1.7% |
| f | 2 | 1.7% |
| c123 | 2 | 1.7% |
| f2 | 2 | 1.7% |
| c68 | 2 | 1.7% |
| e44 | 2 | 1.7% |
| b35 | 2 | 1.7% |
| Other values (78) | 92 |
Most occurring characters
| Value | Count | Frequency (%) |
| 2 | 41 | |
| C | 40 | |
| 1 | 33 | 9.2% |
| 3 | 26 | 7.3% |
| 6 | 24 | 6.7% |
| 5 | 23 | 6.4% |
| B | 23 | 6.4% |
| 4 | 21 | 5.9% |
| D | 18 | 5.0% |
| (space) | 18 | 5.0% |
| Other values (8) | 91 |
| Value | Count | Frequency (%) |
| 2 | 41 | |
| C | 35 | |
| B | 34 | |
| 1 | 28 | 7.8% |
| 5 | 28 | 7.8% |
| 3 | 27 | 7.5% |
| 6 | 27 | 7.5% |
| (space) | 21 | 5.8% |
| 4 | 19 | 5.3% |
| E | 16 | 4.4% |
| Other values (9) | 84 |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 358 |
| Value | Count | Frequency (%) |
| (unknown) | 360 |
Most frequent character per category
(unknown)
| Value | Count | Frequency (%) |
| 2 | 41 | |
| C | 40 | |
| 1 | 33 | 9.2% |
| 3 | 26 | 7.3% |
| 6 | 24 | 6.7% |
| 5 | 23 | 6.4% |
| B | 23 | 6.4% |
| 4 | 21 | 5.9% |
| D | 18 | 5.0% |
| (space) | 18 | 5.0% |
| Other values (8) | 91 |
| Value | Count | Frequency (%) |
| 2 | 41 | |
| C | 35 | |
| B | 34 | |
| 1 | 28 | 7.8% |
| 5 | 28 | 7.8% |
| 3 | 27 | 7.5% |
| 6 | 27 | 7.5% |
| (space) | 21 | 5.8% |
| 4 | 19 | 5.3% |
| E | 16 | 4.4% |
| Other values (9) | 84 |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 358 |
| Value | Count | Frequency (%) |
| (unknown) | 360 |
Most frequent character per script
(unknown)
| Value | Count | Frequency (%) |
| 2 | 41 | |
| C | 40 | |
| 1 | 33 | 9.2% |
| 3 | 26 | 7.3% |
| 6 | 24 | 6.7% |
| 5 | 23 | 6.4% |
| B | 23 | 6.4% |
| 4 | 21 | 5.9% |
| D | 18 | 5.0% |
| (space) | 18 | 5.0% |
| Other values (8) | 91 |
| Value | Count | Frequency (%) |
| 2 | 41 | |
| C | 35 | |
| B | 34 | |
| 1 | 28 | 7.8% |
| 5 | 28 | 7.8% |
| 3 | 27 | 7.5% |
| 6 | 27 | 7.5% |
| | 21 | 5.8% |
| 4 | 19 | 5.3% |
| E | 16 | 4.4% |
| Other values (9) | 84 | 23.3% |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 358 | 100.0% |
| Value | Count | Frequency (%) |
| (unknown) | 360 | 100.0% |
Most frequent character per block
(unknown)
| Value | Count | Frequency (%) |
| 2 | 41 | 11.5% |
| C | 40 | 11.2% |
| 1 | 33 | 9.2% |
| 3 | 26 | 7.3% |
| 6 | 24 | 6.7% |
| 5 | 23 | 6.4% |
| B | 23 | 6.4% |
| 4 | 21 | 5.9% |
| D | 18 | 5.0% |
| | 18 | 5.0% |
| Other values (8) | 91 | 25.4% |
| Value | Count | Frequency (%) |
| 2 | 41 | 11.4% |
| C | 35 | 9.7% |
| B | 34 | 9.4% |
| 1 | 28 | 7.8% |
| 5 | 28 | 7.8% |
| 3 | 27 | 7.5% |
| 6 | 27 | 7.5% |
| | 21 | 5.8% |
| 4 | 19 | 5.3% |
| E | 16 | 4.4% |
| Other values (9) | 84 | 23.3% |
Embarked
Categorical
| Dataset A | Dataset B | |
|---|---|---|
| Distinct | 3 | 3 |
| Distinct (%) | 0.7% | 0.7% |
| Missing | 1 | 0 |
| Missing (%) | 0.2% | 0.0% |
| Memory size | 7.0 KiB | 7.0 KiB |
| Value | Dataset A | Dataset B |
|---|---|---|
| S | 332 | 315 |
| C | 81 | 90 |
| Q | 32 | 41 |
Length
| Dataset A | Dataset B | |
|---|---|---|
| Max length | 1 | 1 |
| Median length | 1 | 1 |
| Mean length | 1 | 1 |
| Min length | 1 | 1 |
Characters and Unicode
| Dataset A | Dataset B | |
|---|---|---|
| Total characters | 445 | 446 |
| Distinct characters | 3 | 3 |
| Distinct categories | 1 | 1 |
| Distinct scripts | 1 | 1 |
| Distinct blocks | 1 | 1 |
Unique
| Dataset A | Dataset B | |
|---|---|---|
| Unique | 0 | 0 |
| Unique (%) | 0.0% | 0.0% |
Sample
| Dataset A | Dataset B | |
|---|---|---|
| 1st row | S | S |
| 2nd row | C | S |
| 3rd row | S | S |
| 4th row | S | S |
| 5th row | S | S |
Common Values
| Value | Count | Frequency (%) |
| S | 332 | 74.4% |
| C | 81 | 18.2% |
| Q | 32 | 7.2% |
| (Missing) | 1 | 0.2% |
| Value | Count | Frequency (%) |
| S | 315 | 70.6% |
| C | 90 | 20.2% |
| Q | 41 | 9.2% |
Length
Common Values (Plot)
Dataset A
Dataset B
| Value | Count | Frequency (%) |
| s | 332 | |
| c | 81 | 18.2% |
| q | 32 | 7.2% |
| Value | Count | Frequency (%) |
| s | 315 | 70.6% |
| c | 90 | 20.2% |
| q | 41 | 9.2% |
Most occurring characters
| Value | Count | Frequency (%) |
| S | 332 | 74.6% |
| C | 81 | 18.2% |
| Q | 32 | 7.2% |
| Value | Count | Frequency (%) |
| S | 315 | 70.6% |
| C | 90 | 20.2% |
| Q | 41 | 9.2% |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 445 | 100.0% |
| Value | Count | Frequency (%) |
| (unknown) | 446 | 100.0% |
Most frequent character per category
(unknown)
| Value | Count | Frequency (%) |
| S | 332 | 74.6% |
| C | 81 | 18.2% |
| Q | 32 | 7.2% |
| Value | Count | Frequency (%) |
| S | 315 | 70.6% |
| C | 90 | 20.2% |
| Q | 41 | 9.2% |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 445 | 100.0% |
| Value | Count | Frequency (%) |
| (unknown) | 446 | 100.0% |
Most frequent character per script
(unknown)
| Value | Count | Frequency (%) |
| S | 332 | 74.6% |
| C | 81 | 18.2% |
| Q | 32 | 7.2% |
| Value | Count | Frequency (%) |
| S | 315 | 70.6% |
| C | 90 | 20.2% |
| Q | 41 | 9.2% |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 445 | 100.0% |
| Value | Count | Frequency (%) |
| (unknown) | 446 | 100.0% |
Most frequent character per block
(unknown)
| Value | Count | Frequency (%) |
| S | 332 | 74.6% |
| C | 81 | 18.2% |
| Q | 32 | 7.2% |
| Value | Count | Frequency (%) |
| S | 315 | 70.6% |
| C | 90 | 20.2% |
| Q | 41 | 9.2% |
Dataset A
| | Age | Embarked | Fare | Parch | PassengerId | Pclass | Sex | SibSp | Survived |
|---|---|---|---|---|---|---|---|---|---|
| Age | 1.000 | 0.000 | 0.133 | -0.248 | 0.017 | 0.238 | 0.160 | -0.122 | 0.153 |
| Embarked | 0.000 | 1.000 | 0.175 | 0.039 | 0.032 | 0.221 | 0.159 | 0.057 | 0.204 |
| Fare | 0.133 | 0.175 | 1.000 | 0.359 | 0.008 | 0.455 | 0.165 | 0.427 | 0.316 |
| Parch | -0.248 | 0.039 | 0.359 | 1.000 | 0.006 | 0.000 | 0.241 | 0.440 | 0.117 |
| PassengerId | 0.017 | 0.032 | 0.008 | 0.006 | 1.000 | 0.007 | 0.000 | -0.033 | 0.165 |
| Pclass | 0.238 | 0.221 | 0.455 | 0.000 | 0.007 | 1.000 | 0.125 | 0.115 | 0.347 |
| Sex | 0.160 | 0.159 | 0.165 | 0.241 | 0.000 | 0.125 | 1.000 | 0.261 | 0.523 |
| SibSp | -0.122 | 0.057 | 0.427 | 0.440 | -0.033 | 0.115 | 0.261 | 1.000 | 0.183 |
| Survived | 0.153 | 0.204 | 0.316 | 0.117 | 0.165 | 0.347 | 0.523 | 0.183 | 1.000 |
Dataset B
| | Age | Embarked | Fare | Parch | PassengerId | Pclass | Sex | SibSp | Survived |
|---|---|---|---|---|---|---|---|---|---|
| Age | 1.000 | 0.085 | 0.107 | -0.253 | 0.054 | 0.262 | 0.092 | -0.177 | 0.137 |
| Embarked | 0.085 | 1.000 | 0.169 | 0.060 | 0.073 | 0.244 | 0.159 | 0.088 | 0.195 |
| Fare | 0.107 | 0.169 | 1.000 | 0.442 | -0.015 | 0.488 | 0.219 | 0.500 | 0.297 |
| Parch | -0.253 | 0.060 | 0.442 | 1.000 | -0.001 | 0.000 | 0.251 | 0.469 | 0.131 |
| PassengerId | 0.054 | 0.073 | -0.015 | -0.001 | 1.000 | 0.111 | 0.088 | -0.081 | 0.000 |
| Pclass | 0.262 | 0.244 | 0.488 | 0.000 | 0.111 | 1.000 | 0.115 | 0.120 | 0.278 |
| Sex | 0.092 | 0.159 | 0.219 | 0.251 | 0.088 | 0.115 | 1.000 | 0.180 | 0.542 |
| SibSp | -0.177 | 0.088 | 0.500 | 0.469 | -0.081 | 0.120 | 0.180 | 1.000 | 0.129 |
| Survived | 0.137 | 0.195 | 0.297 | 0.131 | 0.000 | 0.278 | 0.542 | 0.129 | 1.000 |
Dataset A
| | PassengerId | Survived | Pclass | Name | Sex | Age | SibSp | Parch | Ticket | Fare | Cabin | Embarked |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 372 | 373 | 0 | 3 | Beavan, Mr. William Thomas | male | 19.0 | 0 | 0 | 323951 | 8.0500 | NaN | S |
| 34 | 35 | 0 | 1 | Meyer, Mr. Edgar Joseph | male | 28.0 | 1 | 0 | PC 17604 | 82.1708 | NaN | C |
| 740 | 741 | 1 | 1 | Hawksford, Mr. Walter James | male | NaN | 0 | 0 | 16988 | 30.0000 | D45 | S |
| 23 | 24 | 1 | 1 | Sloper, Mr. William Thompson | male | 28.0 | 0 | 0 | 113788 | 35.5000 | A6 | S |
| 536 | 537 | 0 | 1 | Butt, Major. Archibald Willingham | male | 45.0 | 0 | 0 | 113050 | 26.5500 | B38 | S |
| 251 | 252 | 0 | 3 | Strom, Mrs. Wilhelm (Elna Matilda Persson) | female | 29.0 | 1 | 1 | 347054 | 10.4625 | G6 | S |
| 809 | 810 | 1 | 1 | Chambers, Mrs. Norman Campbell (Bertha Griggs) | female | 33.0 | 1 | 0 | 113806 | 53.1000 | E8 | S |
| 564 | 565 | 0 | 3 | Meanwell, Miss. (Marion Ogden) | female | NaN | 0 | 0 | SOTON/O.Q. 392087 | 8.0500 | NaN | S |
| 596 | 597 | 1 | 2 | Leitch, Miss. Jessie Wills | female | NaN | 0 | 0 | 248727 | 33.0000 | NaN | S |
| 173 | 174 | 0 | 3 | Sivola, Mr. Antti Wilhelm | male | 21.0 | 0 | 0 | STON/O 2. 3101280 | 7.9250 | NaN | S |
Dataset B
| | PassengerId | Survived | Pclass | Name | Sex | Age | SibSp | Parch | Ticket | Fare | Cabin | Embarked |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 12 | 13 | 0 | 3 | Saundercock, Mr. William Henry | male | 20.0 | 0 | 0 | A/5. 2151 | 8.0500 | NaN | S |
| 316 | 317 | 1 | 2 | Kantor, Mrs. Sinai (Miriam Sternin) | female | 24.0 | 1 | 0 | 244367 | 26.0000 | NaN | S |
| 272 | 273 | 1 | 2 | Mellinger, Mrs. (Elizabeth Anne Maidment) | female | 41.0 | 0 | 1 | 250644 | 19.5000 | NaN | S |
| 433 | 434 | 0 | 3 | Kallio, Mr. Nikolai Erland | male | 17.0 | 0 | 0 | STON/O 2. 3101274 | 7.1250 | NaN | S |
| 696 | 697 | 0 | 3 | Kelly, Mr. James | male | 44.0 | 0 | 0 | 363592 | 8.0500 | NaN | S |
| 656 | 657 | 0 | 3 | Radeff, Mr. Alexander | male | NaN | 0 | 0 | 349223 | 7.8958 | NaN | S |
| 416 | 417 | 1 | 2 | Drew, Mrs. James Vivian (Lulu Thorne Christian) | female | 34.0 | 1 | 1 | 28220 | 32.5000 | NaN | S |
| 120 | 121 | 0 | 2 | Hickman, Mr. Stanley George | male | 21.0 | 2 | 0 | S.O.C. 14879 | 73.5000 | NaN | S |
| 629 | 630 | 0 | 3 | O'Connell, Mr. Patrick D | male | NaN | 0 | 0 | 334912 | 7.7333 | NaN | Q |
| 603 | 604 | 0 | 3 | Torber, Mr. Ernst William | male | 44.0 | 0 | 0 | 364511 | 8.0500 | NaN | S |
Dataset A
| | PassengerId | Survived | Pclass | Name | Sex | Age | SibSp | Parch | Ticket | Fare | Cabin | Embarked |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 677 | 678 | 1 | 3 | Turja, Miss. Anna Sofia | female | 18.0 | 0 | 0 | 4138 | 9.8417 | NaN | S |
| 601 | 602 | 0 | 3 | Slabenoff, Mr. Petco | male | NaN | 0 | 0 | 349214 | 7.8958 | NaN | S |
| 621 | 622 | 1 | 1 | Kimball, Mr. Edwin Nelson Jr | male | 42.0 | 1 | 0 | 11753 | 52.5542 | D19 | S |
| 818 | 819 | 0 | 3 | Holm, Mr. John Fredrik Alexander | male | 43.0 | 0 | 0 | C 7075 | 6.4500 | NaN | S |
| 22 | 23 | 1 | 3 | McGowan, Miss. Anna "Annie" | female | 15.0 | 0 | 0 | 330923 | 8.0292 | NaN | Q |
| 844 | 845 | 0 | 3 | Culumovic, Mr. Jeso | male | 17.0 | 0 | 0 | 315090 | 8.6625 | NaN | S |
| 711 | 712 | 0 | 1 | Klaber, Mr. Herman | male | NaN | 0 | 0 | 113028 | 26.5500 | C124 | S |
| 465 | 466 | 0 | 3 | Goncalves, Mr. Manuel Estanslas | male | 38.0 | 0 | 0 | SOTON/O.Q. 3101306 | 7.0500 | NaN | S |
| 605 | 606 | 0 | 3 | Lindell, Mr. Edvard Bengtsson | male | 36.0 | 1 | 0 | 349910 | 15.5500 | NaN | S |
| 864 | 865 | 0 | 2 | Gill, Mr. John William | male | 24.0 | 0 | 0 | 233866 | 13.0000 | NaN | S |
Dataset B
| | PassengerId | Survived | Pclass | Name | Sex | Age | SibSp | Parch | Ticket | Fare | Cabin | Embarked |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 309 | 310 | 1 | 1 | Francatelli, Miss. Laura Mabel | female | 30.0 | 0 | 0 | PC 17485 | 56.9292 | E36 | C |
| 566 | 567 | 0 | 3 | Stoytcheff, Mr. Ilia | male | 19.0 | 0 | 0 | 349205 | 7.8958 | NaN | S |
| 55 | 56 | 1 | 1 | Woolner, Mr. Hugh | male | NaN | 0 | 0 | 19947 | 35.5000 | C52 | S |
| 830 | 831 | 1 | 3 | Yasbeck, Mrs. Antoni (Selini Alexander) | female | 15.0 | 1 | 0 | 2659 | 14.4542 | NaN | C |
| 327 | 328 | 1 | 2 | Ball, Mrs. (Ada E Hall) | female | 36.0 | 0 | 0 | 28551 | 13.0000 | D | S |
| 57 | 58 | 0 | 3 | Novel, Mr. Mansouer | male | 28.5 | 0 | 0 | 2697 | 7.2292 | NaN | C |
| 662 | 663 | 0 | 1 | Colley, Mr. Edward Pomeroy | male | 47.0 | 0 | 0 | 5727 | 25.5875 | E58 | S |
| 790 | 791 | 0 | 3 | Keane, Mr. Andrew "Andy" | male | NaN | 0 | 0 | 12460 | 7.7500 | NaN | Q |
| 553 | 554 | 1 | 3 | Leeni, Mr. Fahim ("Philip Zenni") | male | 22.0 | 0 | 0 | 2620 | 7.2250 | NaN | C |
| 598 | 599 | 0 | 3 | Boulos, Mr. Hanna | male | NaN | 0 | 0 | 2664 | 7.2250 | NaN | C |
Dataset A
| | PassengerId | Survived | Pclass | Name | Sex | Age | SibSp | Parch | Ticket | Fare | Cabin | Embarked | # duplicates |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Dataset does not contain duplicate rows. | | | | | | | | | | | | | |
Dataset B
| | PassengerId | Survived | Pclass | Name | Sex | Age | SibSp | Parch | Ticket | Fare | Cabin | Embarked | # duplicates |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Dataset does not contain duplicate rows. | | | | | | | | | | | | | |